Skip to content

Add a column participant to room_memberships table #18068

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 27 commits into from
Mar 18, 2025

Conversation

H-Shay
Copy link
Contributor

@H-Shay H-Shay commented Jan 7, 2025

Adds a column participant to room_memberships table to track whether a user has participated in a room - participation is defined as having sent a m.room.message or m.room.encrypted event into the room.

Related to #18040, the approach there to determine room participation was deemed too inefficient and adding this column was the recommend remedy.

@H-Shay H-Shay requested a review from a team as a code owner January 7, 2025 20:04
@H-Shay H-Shay changed the title Add a column participant to room_membership table Add a column participant to room_memberships table Jan 7, 2025
@H-Shay
Copy link
Contributor Author

H-Shay commented Jan 7, 2025

Requesting @anoadragon453's feedback on this as he has context.

@H-Shay H-Shay marked this pull request as draft January 7, 2025 20:20
@erikjohnston erikjohnston requested a review from a team January 8, 2025 11:08
@H-Shay
Copy link
Contributor Author

H-Shay commented Jan 13, 2025

Delayed events complement test failure looks like a flake?

@H-Shay H-Shay requested a review from anoadragon453 January 13, 2025 04:58
@H-Shay
Copy link
Contributor Author

H-Shay commented Jan 21, 2025

@anoadragon453 just wondering if I could get your input on this to ensure it aligns with what we discussed in #18040?

Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@H-Shay apologies for the slow turn-around on this one. I've been swamped with other work :(

I've made some large suggestions below, I'm afraid, but overall this is heading in a good direction!

txn: LoggingTransaction, last_room_id: str
) -> Optional[str]:
sql = """
SELECT room_id from room_memberships WHERE room_id > ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that room_memberships is a table that is always be appended to, and thus always changing under you. It is ordered by its event_stream_ordering column. So, the only way to traverse it while the system is running, without leaving gaps, is to iterate using the event_stream_ordering column.

Note: room_memberships only has rows deleted from it when a room is purged.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we have:

  • a query to pull out a room ID
  • a query that pulls out all users that have ever been joined to that room
  • a query per-user per-room that updates all previous entries that match that user/room combination

I think we can instead do this in one query that processes a batch of room_membership rows all at once. Instead of saving the current room_id for the batch job, start with the currently max event_stream_ordering row and work backwards in batches of say 1000.

Constrain your query to the current event_stream_ordering - BATCH_SIZE. Then within that, UPDATE all rows based on data in the events table. Then save the new event_stream_ordering - BATCH_SIZE to your background job.

Now the table can continue to grow without things changing from underneath you, as historical data is only (rarely) deleted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look - I have rewritten the query as you requested. I did test it on a demo server with some trivial data and it worked, but do make sure I've got it correct as I don't often write SQL queries that aren't mostly simple.

@H-Shay
Copy link
Contributor Author

H-Shay commented Feb 4, 2025

complement failures look like a flake

@H-Shay H-Shay requested a review from anoadragon453 February 4, 2025 21:31
Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly there now, just a couple of minor things. Thanks for being patient!

@H-Shay H-Shay requested a review from anoadragon453 February 12, 2025 23:48
@H-Shay
Copy link
Contributor Author

H-Shay commented Mar 5, 2025

@anoadragon453 just checking this is still on your radar? No rush, just want to make sure it didn't get lost

Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huge apologies @H-Shay, this one indeed fell through the cracks.

Some comments below on the historical background job, but otherwise LGTM.

@H-Shay
Copy link
Contributor Author

H-Shay commented Mar 11, 2025

@anoadragon453 thanks for the review, I just had a question about this statement:

This background update will always mark those rows as participant=false if the user isn't currently joined to the room. And it will mark participant=true if they sent a message at any point in the room's history, not just when they were joined that one time.

I think what that is saying is that if the user has sent a message at any point in the room's history they will be set as a participant, unless they are currently not in the room - is that correct thus far? Because if they aren't currently in the room, they will be marked as participant=false, and if they are currently in the room and they ever sent a message they will be marked as participant=true, regardless of whether they have joined/left in the interm? This is how I would expect it to work but I think there is an edge case you are pointing out and I am missing - would you be willing to elaborate?

Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anoadragon453 thanks for the review, I just had a question about this statement:

This background update will always mark those rows as participant=false if the user isn't currently joined to the room. And it will mark participant=true if they sent a message at any point in the room's history, not just when they were joined that one time.

I think what that is saying is that if the user has sent a message at any point in the room's history they will be set as a participant, unless they are currently not in the room - is that correct thus far?

That's correct!

Because if they aren't currently in the room, they will be marked as participant=false, and if they are currently in the room and they ever sent a message they will be marked as participant=true, regardless of whether they have joined/left in the interm? This is how I would expect it to work but I think there is an edge case you are pointing out and I am missing - would you be willing to elaborate?

This harkens back to #18068 (comment), where you could get information about whether they were a participant at the time. However, you had indicated that wasn't too useful for T&S purposes, and I think I had forgotten that.

At this point, I would just mention that the data shouldn't be relied on to produce historical participation data.

@H-Shay H-Shay requested a review from anoadragon453 March 17, 2025 20:42
@H-Shay
Copy link
Contributor Author

H-Shay commented Mar 17, 2025

@anoadragon453 thanks for the answer and thanks for bearing with this - I know you are busy :) I've made the changes so hopefully we are close to done with this.

To allow to easily adding more test cases.
Allow user2 to send state events into the room.
Even if the type of the state event tries to replicate that of a
non-state message or encrypted event.
Users can send stickers and not be considered a participant.
Copy link
Member

@anoadragon453 anoadragon453 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM now (assuming tests pass)! Thank you for your hard work and patience @H-Shay.

I added some commits that parameterised the test cases (making it trivial to add more), as well as added a conditional that the event sent by the user is not a state event.

I also added a test case showing that users can send other event types (such as stickers) into the room and not be marked as a participant.

Before I merge, can you confirm that you're still happy to constrain participation to only m.room.message and m.room.encrypted? I'm wondering if event types like m.reaction, m.sticker, m.call.invite and m.poll.start should be added as well.

@H-Shay
Copy link
Contributor Author

H-Shay commented Mar 18, 2025

@anoadragon453 that is correct, thank you so much!

@anoadragon453 anoadragon453 merged commit 4b8dbe2 into element-hq:develop Mar 18, 2025
39 checks passed
@anoadragon453
Copy link
Member

Wonderful, thanks again for your work here!

erikjohnston added a commit that referenced this pull request Apr 16, 2025
erikjohnston added a commit that referenced this pull request Apr 16, 2025
Follow on from #18068

Currently the subquery in `UPDATE` is pointless, as it will still just
update all `room_membership` rows. Instead, we should look at the
current membership event ID (which is easily retrieved from
`local_current_membership`). We also add a `AND NOT participant` to noop
the `UPDATE` when the `participant` flag is already set.

cc @H-Shay
AND room_memberships.event_stream_ordering <= ?
AND room_memberships.event_stream_ordering > ?;
"""
batch = int(stream_token) - _POPULATE_PARTICIPANT_BG_UPDATE_BATCH_SIZE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional that we're not using the batch_size parameter?

This background update currently does not autoscale down. I wonder if this might be part of the cause of pain for many SQLite users: #18325

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not. I missed that in the review. I'll write a quick PR to use that, good spot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants